Skip to content

Conversation

ZuseZ4
Copy link
Member

@ZuseZ4 ZuseZ4 commented Aug 22, 2025

LLVM's offload functionality usually expects an extra dyn_ptr argument. We could avoid it,b ut likely gonna need it very soon in one of the follow-up PRs (e.g. to request shared memory). So we might as well already add it.

This PR adds a %dyn_ptr ptr to GPUKernel ABI functions, if the offload feature is enabled.

WIP

r? @ghost

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Aug 22, 2025
@rust-log-analyzer

This comment has been minimized.

@ZuseZ4
Copy link
Member Author

ZuseZ4 commented Aug 22, 2025

@oli-obk Probably stupid question, but where do set parameter names?
When lowering any GPUKernel Function (including amdgpu_kernel) I add one ptr argument, so for a one-ptr function I instead generate

; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(argmem: write)
define amdgpu_kernel void @kernel_1(ptr noundef writeonly captures(none) %x, ptr noundef readnone captures(none) %0) unnamed_addr #0 {
start:
  %1 = addrspacecast ptr %x to ptr addrspace(1)
  store double 2.100000e+01, ptr addrspace(1) %1, align 8
  ret void
}

but want to generate
define amdgpu_kernel void @kernel_1(ptr noundef writeonly captures(none) %0, ptr noundef readnone captures(none) %x) (aka ignore the first argument). My PR (asdf commit) just copies the first arg and the last arg get's a generic name, so I assume we have a list of names on the rust side, which we match starting at the first arg. However, I couldn't find a relevant location where we're calling LLVM functions on the rust side to set those names. Do you have any ideas?

Edit: I decided I'll just do the full rewrite of the module on llvm-ir level instead of MIR, since that's what I know best (and seathlin mentioned MIR is probably too low eithr way). I'll add more details later

@rust-log-analyzer

This comment has been minimized.

@ZuseZ4
Copy link
Member Author

ZuseZ4 commented Aug 27, 2025

I'll clean this up further later to minimze the amount of c++, but I lost a bit of patience with LLVM's C API, so I just did 100% of the work with the C++ API. With this and the previous (review ready) patch, Rust's amdgcn target runs on a GPU, without manual LLVM-IR rewriting. I'll port it back from C++ to Rust.
This should also run on Nvidia or Intel GPUs, I just didn't got to test it yet.

@rust-log-analyzer

This comment has been minimized.

@@ -170,6 +172,27 @@ extern "C" void LLVMRustPrintStatistics(RustStringRef OutBuf) {
llvm::PrintStatistics(OS);
}

extern "C" void LLVMRustOffloadMapper(LLVMModuleRef M, LLVMValueRef OldFn, LLVMValueRef NewFn) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is too niche to be worth wrapping in Rust. We would need to introduce the ValueToValueMapTy, and handle the mapping of one value to another. Plus we'd need to expose the used CloneFunctionInto as well as the CloneFunctionChangeTypes.

@rust-log-analyzer
Copy link
Collaborator

The job tidy failed! Check out the build log: (web) (plain enhanced) (plain)

Click to see the possible cause of the failure (guessed by this bot)
All checks passed!
checking python file formatting
27 files already formatted
checking C++ file formatting
/checkout/compiler/rustc_llvm/llvm-wrapper/RustWrapper.cpp:175:75: error: code should be clang-formatted [-Wclang-format-violations]
extern "C" void LLVMRustOffloadMapper(LLVMModuleRef M, LLVMValueRef OldFn, LLVMValueRef NewFn) {
                                                                          ^

clang-format linting failed! Printing diff suggestions:
--- /checkout/compiler/rustc_llvm/llvm-wrapper/RustWrapper.cpp (actual)
+++ /checkout/compiler/rustc_llvm/llvm-wrapper/RustWrapper.cpp (formatted)
@@ -171,9 +171,10 @@
   auto OS = RawRustStringOstream(OutBuf);
   llvm::PrintStatistics(OS);
 }
 
-extern "C" void LLVMRustOffloadMapper(LLVMModuleRef M, LLVMValueRef OldFn, LLVMValueRef NewFn) {
+extern "C" void LLVMRustOffloadMapper(LLVMModuleRef M, LLVMValueRef OldFn,
+                                      LLVMValueRef NewFn) {
   llvm::Module *module = llvm::unwrap(M);
   llvm::Function *oldFn = llvm::unwrap<llvm::Function>(OldFn);
   llvm::Function *newFn = llvm::unwrap<llvm::Function>(NewFn);
 

tidy error: checks with external tool 'clang-format' failed
some tidy checks failed
Bootstrap failed while executing `test src/tools/tidy tidyselftest --extra-checks=py,cpp,js,spellcheck`
Command `/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-tools-bin/rust-tidy /checkout /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo /checkout/obj/build 4 /node/bin/npm --extra-checks=py,cpp,js,spellcheck` failed with exit code 1
Created at: src/bootstrap/src/core/build_steps/tool.rs:1548:23
Executed at: src/bootstrap/src/core/build_steps/test.rs:1296:29

Command has failed. Rerun with -v to see more details.
Build completed unsuccessfully in 0:01:16
  local time: Mon Sep  1 07:41:41 UTC 2025
  network time: Mon, 01 Sep 2025 07:41:41 GMT
##[error]Process completed with exit code 1.
Post job cleanup.

}

let consider_offload = config.offload.contains(&config::Offload::Enable);
if consider_offload && (cgcx.target_arch == "amdgpu" || cgcx.target_arch == "nvptx64") {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these should probably be combined to a target_is_gpu, similar to target_is_like_darwin and target_is_like_aix

@ZuseZ4
Copy link
Member Author

ZuseZ4 commented Sep 1, 2025

r? @oli-obk

This adds two more commits on top of the other pr which fixes the host code generation.
Together, these 4 commits are enough to compile a rust function and run it on a gpu, I left instructions in the dev guide for it: https://rustc-dev-guide.rust-lang.org/offload/usage.html

@ZuseZ4 ZuseZ4 marked this pull request as ready for review September 1, 2025 08:07
@rustbot
Copy link
Collaborator

rustbot commented Sep 1, 2025

oli-obk is not on the review rotation at the moment.
They may take a while to respond.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 1, 2025
@ZuseZ4 ZuseZ4 mentioned this pull request Jul 24, 2025
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants